Quantum Walks on the Hypercube 
Abstract

Recently, it has been shown that one-dimensional quantum walks can mix more quickly than classical
random walks, suggesting that quantum Monte Carlo algorithms can outperform their classical
counterparts. We study two quantum walks on the $n$-dimensional hypercube, one in discrete time and
one in continuous time. In both cases we show that the quantum walk mixes in $(\pi/4)n$ steps, faster
than the $\Theta(n \log n)$ steps required by the classical walk. In the continuous-time case, the probability
distribution is exactly uniform at this time. More importantly, these walks expose several subtleties in
the definition of mixing time for quantum walks. Even though the continuous-time walk has an $O(n)$
instantaneous mixing time at which it is precisely uniform, it never approaches the uniform distribution
when the stopping time is chosen randomly as in [AAKV01]. Our analysis treats interference between
terms of different phase more carefully than is necessary for the walk on the cycle; previous general
bounds predict an exponential, rather than linear, mixing time for the hypercube.



1 Introduction 

Random walks form one of the cornerstones of theoretical computer science. As algorithmic tools, they have
been applied to a variety of central problems, such as estimation of the volume of a convex body [DFK91,
LK99], approximation of the permanent [JS89, JSV00], and discovery of satisfying assignments for Boolean
formulae [Sch99]. Furthermore, the basic technical phenomena appearing in the study of random walks (e.g.,
spectral decomposition, couplings, and Fourier analysis) also support several other important areas such as
pseudorandomness and derandomization (see, e.g., [AS92, (§9, §15)]).

The development of efficient quantum algorithms for problems believed to be intractable for (classical)
randomized computation, like integer factoring and discrete logarithm [Sho97], has prompted the
investigation of quantum walks. This is a natural generalization of the traditional notion discussed above where,
roughly, the process evolves in a unitary rather than stochastic fashion.

The notion of "mixing time," the first time when the distribution induced by a random walk is sufficiently
close to the stationary distribution, plays a central role in the theory of classical random walks. For a given
graph, then, it is natural to ask if a quantum walk can mix more quickly than its classical counterpart. (Since
a unitary process cannot be mixing, we define a stochastic process from a quantum one by performing
a measurement at a given time or a distribution of times.) Several recent articles [AAKV01, ABN+01,
NV00] have answered this question in the affirmative, showing, for example, that a quantum walk on the
$n$-cycle mixes in time $O(n \log n)$, a substantial improvement over the classical random walk which requires
$\Theta(n^2)$ steps to mix. Quantum walks were also defined in [Wat01], and used to show that undirected graph
connectivity is contained in a version of quantum LOGSPACE. These articles raise the exciting possibility
that quantum Monte Carlo algorithms could form a new family of quantum algorithms that work more
quickly than their classical counterparts.

Two distinct notions of quantum walks exist in the literature. The first, introduced by [AAKV01,
ABN+01, NV00], studies the behavior of a "directed particle" on the graph; we refer to these as
discrete-time quantum walks. The second, introduced by [CFG01], defines the dynamics by treating the adjacency
matrix of the graph as a Hamiltonian; we refer to these as continuous-time quantum walks. The landscape
is further complicated by the existence of two distinct notions of mixing time. The first "instantaneous"
notion [ABN+01, NV00] focuses on particular times at which measurement induces a desired distribution;
the second "average" notion [AAKV01], another natural way to convert a quantum process into a stochastic
one, focuses on measurement times selected at random.

In this article, we analyze both the continuous-time and a discrete-time quantum walk on the hypercube.
In both cases, the walk is shown to have an instantaneous mixing time at $(\pi/4)n$. Recall that the classical
walk on the hypercube mixes in time $\Theta(n \log n)$, so that the quantum walk is faster by a logarithmic factor.
Moreover, in the discrete-time case the walk mixes in time less than the diameter of the graph, since $\pi/4 < 1$;
and, astonishingly, in the continuous-time case the probability distribution at $t = (\pi/4)n$ is exactly uniform.
Both of these things happen due to a marvelous conspiracy of destructive interference between terms of
different phase.

These walks show i.) a similarity between the two notions of quantum walks, and ii.) a disparity
between the two notions of quantum mixing times. As mentioned above, both walks have an instantaneous
mixing time at time $(\pi/4)n$. On the other hand, we show that there is no time at which the continuous walk
approaches the uniform distribution in the sense of [AAKV01]. Thus there are some real subtleties involved
in defining mixing times for quantum walks.

The analysis of the hypercubic quantum walk exhibits a number of features markedly different from
those appearing in previously studied walks. In particular, the dimension of the relevant Hilbert space is, for
the hypercube, exponential in the length of the desired walk, while in the cycle these quantities are roughly
equal. This requires that interference be handled in a more delicate way than is required for the walk on
the cycle; in particular, the general bound of [AAKV01] predicts an exponentially large mixing time for the
discrete-time walk.

We begin by defining quantum walks and discussing various notions of mixing time. We then analyze
the two quantum walks on the hypercube in Sections 2 and 3. (Most of the technical details for the
discrete-time walk are relegated to an appendix.) Finally, in Section 4, we discuss mixing times in the sense of
[AAKV01].



1.1 Quantum walks and mixing times 

Any graph $G = (V,E)$ gives rise to a familiar Markov chain by assigning probability $1/d$ to all edges leaving
each vertex $v$ of degree $d$. Let $P_u^t(v)$ be the probability of visiting a vertex $v$ at step $t$ of the random walk on
$G$ starting at $u$. If $G$ is undirected, connected, and not bipartite, then $\lim_{t\to\infty} P_u^t$ exists and is independent of
$u$. (In fact, this limit exists under more general circumstances; see, e.g., [MR95].) A variety of well-developed
techniques exist for establishing bounds on the rate at which $P_u^t$ achieves this
limit (e.g., [Vaz92]); if $G$ happens to be the Cayley graph of a group (as are, for example, the cycle and the
hypercube), then techniques from Fourier analysis can be applied (see [Dia88]). Below we will use some
aspects of this approach, especially the Diaconis-Shahshahani bound on the total variation distance [DS81].
For simplicity, we restrict our discussion to quantum walks on Cayley graphs; more general treatments
of quantum walks appear in [AAKV01, CFG01]. Before describing the quantum walk models we set down
some notation.

Notation. For a group $G$ and a set of generators $\Gamma$ such that $\Gamma = \Gamma^{-1}$, we let $X(G,\Gamma)$ denote the undirected
Cayley graph of $G$ with respect to $\Gamma$. For a finite set $S$, we let $L(S) = \{f : S \to \mathbb{C}\}$ denote the collection of
$\mathbb{C}$-valued functions on $S$. This is a Hilbert space under the natural inner product $\langle f|g\rangle = \sum_{s\in S} f(s)\,g(s)^*$. For
a Hilbert space $V$, an operator $U : V \to V$ is unitary if for all $v, w \in V$, $\langle v|w\rangle = \langle Uv|Uw\rangle$; if $U$ is represented
as a matrix, this is equivalent to the condition that $U^\dagger = U^{-1}$ where $\dagger$ denotes the Hermitian conjugate.
There are two natural quantum walks that one can define for such graphs, which we now describe.



The discrete-time walk. This model, introduced by [AAKV01, ABN+01, NV00], augments the space
$L(G)$ with a direction space, each basis vector of which corresponds to one of the generators in $\Gamma$. A step
of the walk then consists of the composition of two unitary transformations: a shift operator which
leaves the direction unchanged while moving the particle in the appropriate direction, and a local
transformation which operates on the direction while leaving the position unchanged. To be precise,
the quantum walk on $X(G,\Gamma)$ is defined on the space $L(G \times \Gamma) = L(G) \otimes L(\Gamma)$. Let $\{\delta_\gamma \mid \gamma \in \Gamma\}$
be the natural basis for $L(\Gamma)$, and $\{\delta_g \mid g \in G\}$ the natural basis for $L(G)$. Then the shift operator is
$S : (\delta_g \otimes \delta_\gamma) \mapsto (\delta_{g\gamma} \otimes \delta_\gamma)$, and the local transformation is $\check{D} = 1 \otimes D$ where $D$ is defined on $L(\Gamma)$ alone
and $1$ is the identity on $L(G)$. Then one "step" of the walk corresponds to the operator $U = S\check{D}$. If
we measure the position of the particle, but not its direction, at time $t$, we observe a vertex $v$ with
probability $P_t(v) = \sum_{\gamma\in\Gamma} |\langle U^t\psi_0 \mid \delta_v \otimes \delta_\gamma\rangle|^2$ where $\psi_0 \in L(G \times \Gamma)$ is the initial state.



The continuous-time walk. This model, introduced by [CFG01], works directly with $L(G)$, the Hilbert
space of $\mathbb{C}$-valued functions on $G$: $L(G) = \{f : G \to \mathbb{C}\}$. The walk evolves by treating the adjacency
matrix of the graph as a Hamiltonian and using the Schrödinger equation. Specifically, if $H$ is the
adjacency matrix of $X(G,\Gamma)$, the evolution of the system at time $t$ is given by $U_t$, where $U_t = e^{iHt}$ (here
we use the matrix exponential, and $U_t$ is unitary since $H$ is real and symmetric). Then if we measure
the position of the particle at time $t$, we observe a vertex $v$ with probability $P_t(v) = |\langle U_t\psi_0 \mid \delta_v\rangle|^2$ where
$\psi_0$ is the initial state.

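In both cases we start with an initial wave function concentrated at a single vertex $u$. For the
continuous-time walk, this corresponds to a wave function

$$\psi_0(v) = \langle\psi_0|\delta_v\rangle = \begin{cases} 1 & \text{if } u = v, \\ 0 & \text{otherwise.} \end{cases}$$

For the discrete-time walk, we start with a uniform superposition over all possible directions,

$$\psi_0(v,\gamma) = \langle\psi_0|\delta_v\otimes\delta_\gamma\rangle = \begin{cases} 1/\sqrt{|\Gamma|} & \text{if } u = v, \\ 0 & \text{otherwise.} \end{cases}$$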



In order to define a discrete quantum walk, one must select a local operator $D$ on the direction space.
In principle, this introduces some arbitrariness into the definition. However, if we wish $D$ to respect the
permutation symmetry of the $n$-cube, and if we wish to maximize the operator distance between $D$ and the
identity, we show in Appendix A that we are forced to choose Grover's diffusion operator [Gro96], which
we recall below. We call the resulting walk the "symmetric discrete-time quantum walk" on the $n$-cube.
(Watrous [Wat01] also used Grover's operator to define quantum walks on undirected graphs.)

(Since for large $n$ Grover's operator is close to the identity matrix, one might imagine that it would take
$\Omega(n^{1/2})$ steps to even change direction, giving the quantum walk a mixing time of roughly $n^{3/2}$, slower than the
classical random walk. However, like many intuitions about quantum mechanics, this is simply wrong.)

Since the evolution of the quantum walk is governed by a unitary operator rather than a stochastic one,
unless $P_t$ is constant for all $t$, there can be no "stationary distribution" $\lim_{t\to\infty} P_t$. In particular, for any $\epsilon > 0$,
there are infinitely many (positive, integer) times $t$ for which $\|U^t - 1\| < \epsilon$, so that $\|U^t\psi_u - \psi_u\| < \epsilon$ and $P_t$ is
close to the initial distribution. However, there may be particular stopping times $t$ which induce distributions
close to, say, the uniform distribution, and we call these instantaneous mixing times:

Definition 1 We say that $t$ is an $\epsilon$-instantaneous mixing time for a quantum walk if $\|P_t - U\| < \epsilon$, where

$$\|A - B\| = \frac{1}{2}\sum_v |A(v) - B(v)|$$

denotes total variation distance and $U$ denotes the uniform distribution.
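
For concreteness, the distance in Definition 1 can be computed directly for explicitly listed distributions. The following Python sketch is illustrative only; the helper names `total_variation` and `uniform` are hypothetical and not part of the original text.

```python
def total_variation(p, q):
    """Total variation distance (1/2) * sum_v |p(v) - q(v)| between two
    distributions given as equal-length sequences of probabilities."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same support size")
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def uniform(size):
    """Uniform distribution on `size` points."""
    return [1.0 / size] * size


# A point mass on one of four vertices compared with the uniform distribution.
point_mass = [1.0, 0.0, 0.0, 0.0]
distance = total_variation(point_mass, uniform(4))
```

With this convention a point mass on $N$ vertices is at distance $1 - 1/N$ from uniform, so the distance always lies in $[0,1]$.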
For these walks we show: 

Theorem 1 For the symmetric discrete-time quantum walk on the $n$-cube, $t = \lceil k(\pi/4)n \rceil$ is an $\epsilon$-instantaneous
mixing time with $\epsilon = O(n^{-7/6})$ for all odd $k$.

and, even more surprisingly,

Theorem 2 For the continuous-time quantum walk on the $n$-hypercube, $t = k(\pi/4)n$ is a 0-instantaneous
mixing time for all odd $k$.

Thus in both cases the mixing time is $\Theta(n)$, as opposed to $\Theta(n \log n)$ as it is in the classical case.

Aharonov et al. [AAKV01] define another natural notion of mixing time for quantum walks, in which
the stopping time $t$ is selected uniformly from the set $\{0, \ldots, T-1\}$. They show that the distributions
$\bar{P}_T = \frac{1}{T}\sum_{t=0}^{T-1} P_t$ do converge as $T \to \infty$ and study the rate at which this occurs. For a continuous random
walk, we analogously define the distribution $\bar{P}_T(v) = (1/T)\int_0^T P_t(v)\,dt$. Then we call a time at which the
resulting distribution $\bar{P}_T$ is close to uniform an average mixing time:

Definition 2 We say that $T$ is an $\epsilon$-average mixing time for a quantum walk if $\|\bar{P}_T - U\| < \epsilon$.

The exact relationship between instantaneous and average mixing times is unclear. In fact, while the
continuous walk on the hypercube possesses 0-instantaneous mixing times at all odd multiples of $(\pi/4)n$,
the limiting distribution of $\bar{P}_T$ is not the uniform distribution, and we will show that an $\epsilon > 0$ exists such that
no time is an $\epsilon$-average mixing time. For the discrete-time walk, the limiting distribution is uniform and we
show that the general bound given in [AAKV01] predicts an exponential, rather than linear, average mixing
time for the hypercube.



2 The symmetric discrete-time walk 

In this section we prove Theorem 1. We treat the $n$-cube as the Cayley graph of $\mathbb{Z}_2^n$ with the regular basis
vectors $e_i = (0, \ldots, 1, \ldots, 0)$ with the 1 appearing in the $i$th place. Then the discrete-time walk takes place
in the Hilbert space $L(\mathbb{Z}_2^n \times [n])$ where $[n] = \{1, \ldots, n\}$. Here the first component represents the position of
the particle in the hypercube, and the second component represents the "direction" currently associated with
the particle.



As in [AAKV01, NV00], we will not impose a group structure on the direction space, and will Fourier
transform only over the position space. For this reason, we will express an element $\psi$ in $L(\mathbb{Z}_2^n) \otimes L([n])$ as a
function $\Psi : \mathbb{Z}_2^n \to \mathbb{C}^n$, where the $i$th coordinate of $\Psi(x)$ is the projection of $\psi$ into $\delta_x \otimes \delta_i$, i.e. the complex
amplitude of the particle being at position $x$ with direction $i$. The Fourier transform of such an element $\Psi$ is
$\tilde{\Psi} : \mathbb{Z}_2^n \to \mathbb{C}^n$, where

$$\tilde{\Psi}(k) = 2^{-n/2}\sum_x (-1)^{k\cdot x}\,\Psi(x).$$
Then the shift operator for the hypercube is

$$S : \Psi(x) \mapsto \sum_{i=1}^{n} \pi_i\,\Psi(x \oplus e_i)$$

where $e_i$ is the $i$th basis vector in the $n$-cube, and $\pi_i$ is the projection operator for the $i$th direction. The
reason for considering the Fourier transform above is that the shift operator is locally diagonal in this basis:
specifically, it maps $\tilde{\Psi}(k) \mapsto S_k\tilde{\Psi}(k)$ where

$$S_k = \begin{pmatrix} (-1)^{k_1} & & 0 \\ & \ddots & \\ 0 & & (-1)^{k_n} \end{pmatrix}.$$

For the local transformation, we use Grover's diffusion operator on $n$ states, $D_{ij} = 2/n - \delta_{ij}$.

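The advantage of Grover's operator is that, like the $n$-cube itself, it is permutation symmetric. We use
this symmetry to rearrange $U_k = S_kD$ to put the negated rows on the bottom,

$$U_k = \left(\begin{array}{cccccc}
\frac{2}{n}-1 & \frac{2}{n} & \cdots & & \cdots & \frac{2}{n} \\
\frac{2}{n} & \frac{2}{n}-1 & & & & \vdots \\
\vdots & & \ddots & & & \\ \hline
-\frac{2}{n} & \cdots & & -\frac{2}{n}+1 & -\frac{2}{n} & \cdots \\
\vdots & & & -\frac{2}{n} & -\frac{2}{n}+1 & \\
 & & & \vdots & & \ddots
\end{array}\right)$$

where the top and bottom blocks have $n-k$ and $k$ rows respectively; here $k$ is the Hamming weight of $k$.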

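The eigenvalues of $U_k$ then depend only on $k$. Specifically, $U_k$ has the eigenvalues $+1$ and $-1$ with
multiplicity $k-1$ and $n-k-1$ respectively, plus the eigenvalues $\lambda_k, \lambda_k^*$ where

$$\lambda_k = 1 - \frac{2k}{n} + \frac{2i}{n}\sqrt{k(n-k)} = e^{i\omega_k}$$

and $\omega_k \in [0,\pi]$ is described by

$$\cos\omega_k = 1 - \frac{2k}{n}, \qquad \sin\omega_k = \frac{2}{n}\sqrt{k(n-k)}.$$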



Its eigenvectors with eigenvalue $+1$ span the $(k-1)$-dimensional subspace consisting of vectors with support
on the $k$ "flipped" directions that sum to zero, and similarly the eigenvectors with eigenvalue $-1$ span the
$(n-k-1)$-dimensional subspace of vectors on the $n-k$ other directions that sum to zero. We call these the
trivial eigenvectors. The eigenvectors of $U_k$ with eigenvalues $\lambda_k, \lambda_k^* = e^{\pm i\omega_k}$ are

$$v_k, v_k^* = \frac{1}{\sqrt{2}}\Bigl(\underbrace{\tfrac{1}{\sqrt{n-k}}, \ldots, \tfrac{1}{\sqrt{n-k}}}_{n-k},\ \underbrace{\tfrac{\pm i}{\sqrt{k}}, \ldots, \tfrac{\pm i}{\sqrt{k}}}_{k}\Bigr).$$

We call these the non-trivial eigenvectors for a given $k$. Over the space of positions and directions these
eigenvectors are multiplied by the Fourier coefficient $(-1)^{k\cdot x}$, so as a function of $x$ and direction $1 \le j \le n$
the two non-trivial eigenstates of the entire system, for a given $k$, are

$$v_k(x,j) = (-1)^{k\cdot x}\,\frac{2^{-n/2}}{\sqrt{2}} \times \begin{cases} i/\sqrt{k} & \text{if } k_j = 1 \\ 1/\sqrt{n-k} & \text{if } k_j = 0 \end{cases}$$

with eigenvalue $e^{i\omega_k}$, and its conjugate $v_k^*$ with eigenvalue $e^{-i\omega_k}$.
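
The eigenvalue claim can be checked numerically for small $n$. The Python sketch below builds $U_k = S_kD$ from $D_{ij} = 2/n - \delta_{ij}$ and tests a candidate non-trivial eigenvector with entries $1/\sqrt{2(n-k)}$ on unflipped directions and $i/\sqrt{2k}$ on flipped ones; this overall phase convention and all function names are assumptions made for illustration.

```python
import cmath
import math


def grover(n):
    """Grover's diffusion operator D_ij = 2/n - delta_ij as a list of rows."""
    return [[2.0 / n - (1.0 if i == j else 0.0) for j in range(n)]
            for i in range(n)]


def walk_block(k_bits):
    """U_k = S_k D, where S_k = diag((-1)^{k_i}) negates the flipped rows."""
    n = len(k_bits)
    d = grover(n)
    return [[(-1) ** k_bits[i] * d[i][j] for j in range(n)] for i in range(n)]


def nontrivial_eigenvector(k_bits):
    """Candidate eigenvector (assumed phase convention): 1/sqrt(2(n-k)) on
    unflipped directions, i/sqrt(2k) on flipped ones. Requires 0 < k < n."""
    n, k = len(k_bits), sum(k_bits)
    return [1j / math.sqrt(2 * k) if bit else 1 / math.sqrt(2 * (n - k))
            for bit in k_bits]


def eigen_residual(k_bits):
    """Largest entry of |U_k v - e^{i omega_k} v|, cos(omega_k) = 1 - 2k/n."""
    n, k = len(k_bits), sum(k_bits)
    omega = math.acos(1 - 2 * k / n)
    u, v = walk_block(k_bits), nontrivial_eigenvector(k_bits)
    uv = [sum(u[i][j] * v[j] for j in range(n)) for i in range(n)]
    lam = cmath.exp(1j * omega)
    return max(abs(uv[i] - lam * v[i]) for i in range(n))


residual = eigen_residual([1, 0, 1, 1, 0, 0, 0])  # n = 7, k = 3
```

The residual should vanish up to rounding error for any bit pattern with $0 < k < n$, which also confirms that the eigenvalue depends on $k$ only through its Hamming weight.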

We take for our initial wave function a particle at the origin $u = (0, \ldots, 0)$ in an equal superposition of
directions. Since its position is a $\delta$-function in real space it is uniform in Fourier space as well as over the
direction space, giving

$$\tilde{\Psi}_0(k) = \frac{2^{-n/2}}{\sqrt{n}}\,(1, \ldots, 1).$$

This is perpendicular to all the trivial eigenvectors, so their amplitudes are all zero. The amplitude of its
component along the non-trivial eigenvector $v_k$ is

$$a_k = \langle\Psi_0|v_k\rangle = \frac{2^{-n/2}}{\sqrt{2}}\left(\sqrt{1-\frac{k}{n}} - i\sqrt{\frac{k}{n}}\right) \tag{1}$$

and the amplitude of $v_k^*$ is $a_k^*$. Note that $|a_k|^2 = 2^{-n}/2$, so a particle is equally likely to appear in either
non-trivial eigenstate with any given wave vector.

At this point, we note that there are an exponential number of eigenvectors in which the initial state has
a non-zero amplitude. In Section 4, we show that the general bound of Aharonov et al. [AAKV01] predicts
an exponential mixing time. In general, this bound performs poorly whenever the number of important
eigenvalues is greater than the mixing time.

Instead, we will use the Diaconis-Shahshahani bound on the total variation distance in terms of the
Fourier coefficients of the probability [Dia88]. If $P_t(x)$ is the probability of the particle being observed at
position $x$ at time $t$, and $U$ is the uniform distribution, then the total variation distance is bounded by

$$\|P_t - U\|^2 \le \frac{1}{4}\sum_{\substack{k \ne (0,\ldots,0) \\ k \ne (1,\ldots,1)}} |\tilde{P}_t(k)|^2 = \frac{1}{4}\sum_{k=1}^{n-1}\binom{n}{k}\,|\tilde{P}_t(k)|^2. \tag{2}$$

Here we exclude both the constant term and the parity term $k = (1, \ldots, 1)$; since our walk changes position
at every step, we only visit vertices with odd or even parity at odd or even times respectively. Thus $U$ here
means the uniform distribution with probability $2^{-(n-1)}$ on the vertices of appropriate parity.



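To find $\tilde{P}_t(k)$, we first need $\tilde{\Psi}_t(k)$. As Nayak and Vishwanath [NV00] did for the walk on the line, we
start by calculating the $t$th matrix power of $U_k$. This is

$$U_k^t = \left(\begin{array}{ccc|ccc}
a+(-1)^t & a & \cdots & c & c & \cdots \\
a & a+(-1)^t & & c & c & \\
\vdots & & \ddots & \vdots & & \ddots \\ \hline
-c & -c & \cdots & b+1 & b & \cdots \\
-c & -c & & b & b+1 & \\
\vdots & & \ddots & \vdots & & \ddots
\end{array}\right)$$

where

$$a = \frac{\cos\omega_k t - (-1)^t}{n-k}, \qquad b = \frac{\cos\omega_k t - 1}{k}, \qquad c = \frac{\sin\omega_k t}{\sqrt{k(n-k)}}.$$

The identity terms $(-1)^t$ and $1$ on the two diagonal blocks are the $t$th powers of the trivial eigenvalues $-1$ and $+1$.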

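Starting with the uniform initial state, the wave function after $t$ steps is

$$\tilde{\Psi}_t(k) = \frac{2^{-n/2}}{\sqrt{n}}\Bigl(\underbrace{\cos\omega_k t + \sqrt{\tfrac{k}{n-k}}\,\sin\omega_k t,\ \ldots}_{n-k},\ \underbrace{\cos\omega_k t - \sqrt{\tfrac{n-k}{k}}\,\sin\omega_k t,\ \ldots}_{k}\Bigr). \tag{3}$$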
We could, at this point, calculate $\Psi_t(x)$ by Fourier transforming this back to real space. However, this
calculation turns out to be significantly more awkward than calculating the Fourier transform of the
probability distribution, $\tilde{P}_t(k)$, which we need to apply the Diaconis-Shahshahani bound. Since $P_t(x) = \Psi_t(x)\Psi_t(x)^*$,
and since multiplications in real space are convolutions in Fourier space, we perform a convolution over $\mathbb{Z}_2^n$:

$$\tilde{P}_t(k) = \sum_{k'} \tilde{\Psi}_t(k') \cdot \tilde{\Psi}_t(k \oplus k')$$

where the inner product is defined on the direction space, $u \cdot v = \sum_{i=1}^{n} u_i v_i^*$. We write this as a sum over $j$,
the number of bits of overlap between $k'$ and $k$, and $l$, the number of bits of $k'$ outside the bits of $k$ (and so
overlapping with $k \oplus k'$). Thus $k'$ has weight $j + l$, and $k \oplus k'$ has weight $k - j + l$.

Calculating the dot product $\tilde{\Psi}_t(k') \cdot \tilde{\Psi}_t(k \oplus k')$ explicitly from Equation 3 as a function of these weights
and overlaps, we have

$$\tilde{P}_t(k) = \frac{1}{2^n}\sum_{j=0}^{k}\sum_{l=0}^{n-k}\binom{k}{j}\binom{n-k}{l}\Bigl[\cos\omega_{j+l}t\,\cos\omega_{k-j+l}t + A\,\sin\omega_{j+l}t\,\sin\omega_{k-j+l}t\Bigr] \tag{4}$$

where

$$A = \frac{\cos\omega_k - \cos\omega_{j+l}\cos\omega_{k-j+l}}{\sin\omega_{j+l}\,\sin\omega_{k-j+l}}.$$

The reader can check that this gives $\tilde{P}_t(0) = 1$ for the trivial Fourier component where $k = 0$, and $\tilde{P}_t(n) =
(-1)^t$ for the parity term where $k = n$.
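
Equation 4 can be compared against a brute-force simulation for small $n$. The Python sketch below is a check under stated assumptions, not the method used for the figures: it applies Grover's operator on the directions and then the shift, starting from the origin, and compares the resulting Fourier coefficients with the double sum. Terms in which $\sin\omega_{j+l}$ or $\sin\omega_{k-j+l}$ vanishes are dropped, since $\sin(\omega t) = 0$ there for integer $t$. All function names are hypothetical.

```python
import math
from itertools import product


def simulate(n, steps):
    """Brute-force walk U = S(1 x D) from the origin; returns P_t(x)."""
    psi = {x: [0.0] * n for x in product((0, 1), repeat=n)}
    psi[(0,) * n] = [1 / math.sqrt(n)] * n
    for _ in range(steps):
        new = {x: [0.0] * n for x in psi}
        for x, amp in psi.items():
            total = sum(amp)
            for i in range(n):
                y = x[:i] + (1 - x[i],) + x[i + 1:]
                # (D amp)_i = 2*sum/n - amp_i, then shifted along e_i
                new[y][i] = 2 * total / n - amp[i]
        psi = new
    # amplitudes stay real because D and the initial state are real
    return {x: sum(a * a for a in amp) for x, amp in psi.items()}


def fourier_brute(n, k, steps):
    """Fourier coefficient of P_t for the wave vector with the first k bits set."""
    probs = simulate(n, steps)
    return sum((-1) ** sum(x[:k]) * p for x, p in probs.items())


def fourier_formula(n, k, t):
    """The double sum of Equation 4 for Hamming weight k and integer t."""
    def omega(m):
        return math.acos(1 - 2 * m / n)

    total = 0.0
    for j in range(k + 1):
        for l in range(n - k + 1):
            m, mp = j + l, k - j + l
            term = math.cos(omega(m) * t) * math.cos(omega(mp) * t)
            if 0 < m < n and 0 < mp < n:  # otherwise the sine term vanishes
                a = ((math.cos(omega(k)) - math.cos(omega(m)) * math.cos(omega(mp)))
                     / (math.sin(omega(m)) * math.sin(omega(mp))))
                term += a * math.sin(omega(m) * t) * math.sin(omega(mp) * t)
            total += math.comb(k, j) * math.comb(n - k, l) * term
    return total / 2 ** n


max_gap = max(abs(fourier_brute(4, k, t) - fourier_formula(4, k, t))
              for k in range(5) for t in range(6))
```

For $n = 4$ the two computations should agree to rounding error for every weight $k$ and every step count tried, including the special cases $k = 0$ and $k = n$ mentioned above.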

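Using the identities $\cos a \cos b = (1/2)(\cos(a-b) + \cos(a+b))$ and $\sin a \sin b = (1/2)(\cos(a-b) -
\cos(a+b))$ we can rewrite Equation 4 as

$$\tilde{P}_t(k) = \frac{1}{2^n}\sum_{j=0}^{k}\sum_{l=0}^{n-k}\binom{k}{j}\binom{n-k}{l}\left[\frac{1-A}{2}\cos\omega_+ t + \frac{1+A}{2}\cos\omega_- t\right] = \frac{1}{2^n}\sum_{j=0}^{k}\sum_{l=0}^{n-k}\binom{k}{j}\binom{n-k}{l}\,Y \tag{5}$$

where $\omega_\pm = \omega_{j+l} \pm \omega_{k-j+l}$.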

The terms $\cos\omega_\pm t$ in $Y$ are rapidly oscillating with a frequency that increases with $t$. Thus, unlike the
walk on the cycle, the phase is rapidly oscillating everywhere, as a function of either $l$ or $j$. This will make
the dominant contribution to $\tilde{P}_t(k)$ exponentially small when $t/n = \pi/4$, giving us a small variation distance
when we sum over all $k$.

To give some intuition for the remainder of the proof, we pause here to note that if Equation 5 were an
integral rather than a sum, we could immediately approximate the rate of oscillation of $Y$ to first order at
the peaks of the binomials, where $j = k/2$ and $l = (n-k)/2$. One can check that $d\omega_k/dk \ge 2/n$ and hence
$d\omega_+/dl = d\omega_-/dj \ge 4/n$. Since $|A| \le 1$, we would then write

$$\tilde{P}_t(k) = O\left(\frac{1}{2^n}\sum_{j=0}^{k}\sum_{l=0}^{n-k}\binom{k}{j}\binom{n-k}{l}\left(e^{4ijt/n} + e^{4ilt/n}\right)\right)$$

which, using the binomial theorem, would give

$$|\tilde{P}_t(k)| = O\left(\left|\frac{1+e^{4it/n}}{2}\right|^{k} + \left|\frac{1+e^{4it/n}}{2}\right|^{n-k}\right) = O\left(\cos^k\frac{2t}{n} + \cos^{n-k}\frac{2t}{n}\right). \tag{6}$$
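
The binomial step behind this estimate can be spelled out as follows; this is only a sketch of the heuristic, keeping the first-order phases assumed above. Summing the phase $e^{4ijt/n}$ over $j$ while the sum over $l$ contributes a factor of one gives

$$\frac{1}{2^n}\sum_{j=0}^{k}\sum_{l=0}^{n-k}\binom{k}{j}\binom{n-k}{l}e^{4ijt/n} = \left(\frac{1+e^{4it/n}}{2}\right)^{k} = e^{2ikt/n}\cos^{k}\frac{2t}{n}, \qquad \left|\frac{1+e^{4it/n}}{2}\right| = \left|\cos\frac{2t}{n}\right|,$$

and the term with phase $e^{4ilt/n}$ is handled identically with $k$ replaced by $n-k$.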



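In this case the Diaconis-Shahshahani bound and the binomial theorem give

$$\|P_t - U\|^2 \le \frac{1}{4}\sum_{0<k<n}\binom{n}{k}\left|\cos^k\frac{2t}{n} + \cos^{n-k}\frac{2t}{n}\right|^2 = \frac{1}{2}\left[(2^n-2)\cos^n\frac{2t}{n} + \left(1+\cos^2\frac{2t}{n}\right)^{n} - 1 - \cos^{2n}\frac{2t}{n}\right],$$

up to the constants hidden in Equation 6.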
Figure 1: (a) Variation distance at time $t$ as a function of $t/n$. (b) Probability as a function of Hamming weight.
Graph (a) plots an exact calculation of the total variation distance after $t$ steps of the quantum walk for
hypercubes of dimension 50, 100, and 200, as a function of $t/n$. At $t/n = \pi/4$ the variation distance is small even
though the walk has not had time to cross the entire graph. This happens because the distribution is roughly uniform
across the equator of the $n$-cube where the vast majority of the points are located. Graph (b) shows the probability
distribution on the 200-dimensional hypercube after $157 \approx (\pi/4)n$ steps. The probability distribution has a plateau of
$2^{-199}$ at the equator, matching the uniform distribution up to parity. Shown is the log of the probability as a function
of Hamming distance from the starting point.

If we could take $t$ to be the non-integer value $(\pi/4)n$, these cosines would be zero.

This will, in fact, turn out to be the right answer. But since Equation 5 is a sum, not an integral, we have
to be wary of resonances where the oscillations are such that the phase changes by a multiple of $2\pi$ between
adjacent terms, in which case these terms will interfere constructively rather than destructively. Thus to show
that the first-order oscillation indeed dominates, we have a significant amount of work left to do. The details
of managing these resonances can be found in Appendix B. The process can be summarized as follows:
i.) we compute the Fourier transform of the quantity $Y$ in Equation 5, since the sum of Equation 5 can be
calculated for a single Fourier basis function using the binomial theorem; ii.) the Fourier transform of $Y$ can
be asymptotically bounded by the method of stationary phase. The dominant stationary point corresponds
to the first-order oscillation, but there are an infinite number of other stationary points as well; so iii.) we
use an entropy bound to show that the contribution of the other stationary points is exponentially small.

To illustrate our result, we have calculated the probability distribution, and the total variation distance 
from the uniform distribution (up to parity), as a function of time for hypercubes of dimension 50, 100, 
and 200. In order to do this exactly, we use the walk's permutation symmetry to collapse its dynamics 
to a function only of Hamming distance. In Figure 1(a) we see that the total variation distance becomes 
small when $t/n = \pi/4$, and in Figure 1(b) we see how the probability distribution is close to uniform on a
"plateau" across the hypercube's equator. Since this is where the vast majority of the points are located, the 
total variation distance is small even though the walk has not yet had time to cross the entire graph. 



3 The continuous-time walk 

In this section we prove Theorem 2. Childs, Farhi and Gutmann [CFG01] define quantum walks in a different
way, in which the unitary operator is generated from a Hamiltonian $H$ using Schrödinger's equation. If $H$
is simply the adjacency matrix of the graph, then $U_t = e^{iHt} = 1 + iHt + (iHt)^2/2 + \cdots$, giving a walk in
continuous time. The amplitude of making $s$ steps is the coefficient $(it)^s/s!$ of $H^s$, which up to normalization
is Poisson-distributed with mean $t$. They point out that this avoids the need to extend the Hilbert space of the
particle with a direction space, and to define some local operation on it such as Grover's operator, in order
to make the walk unitary. While this approach is less familiar in computer science, a quantum computer
which is allowed to evolve in continuous time according to a certain Hamiltonian seems just as physically
reasonable as one which uses a clock to evolve in discrete time as traditional computers do.

In the case of the hypercube, this walk turns out to be particularly easy to analyze. The adjacency matrix,
normalized by the degree, is

$$H(x,y) = \begin{cases} 1/n & \text{if } d(x,y) = 1 \\ 0 & \text{otherwise} \end{cases}$$

where $d$ is the Hamming distance. The eigenvectors of $H$ and $U_t$ are simply the Fourier basis functions:
if $v_k(x) = (-1)^{k\cdot x}$ then $Hv_k = (1 - 2k/n)v_k$ and $U_tv_k = e^{it(1-2k/n)}v_k$, where we again use $k$ to denote the
Hamming weight of $k$. If our initial wave vector has a particle at $x = (0, \ldots, 0)$, then its initial Fourier
spectrum is uniform, and at time $t$ we have

$$\tilde{\Psi}_t(k) = 2^{-n/2}\,e^{it(1-2k/n)}.$$

Again writing the probability $P$ as the convolution of $\Psi$ with $\Psi^*$ in Fourier space, we have

$$\tilde{P}_t(k) = \sum_{k'}\tilde{\Psi}_t(k')\,\tilde{\Psi}_t^*(k \oplus k') = \frac{1}{2^n}\sum_{k'} e^{(2it/n)(|k\oplus k'| - |k'|)}.$$

We write this as a sum over all possible overlaps $j$ between $k'$ and $k$, and overlaps $l$ between $k'$ and $k \oplus k'$.
Noting that $|k'| = j + l$ and $|k \oplus k'| = k - j + l$, this gives

$$\tilde{P}_t(k) = \frac{1}{2^n}\sum_{j=0}^{k}\sum_{l=0}^{n-k}\binom{k}{j}\binom{n-k}{l}\,e^{(2it/n)(k-2j)} = \cos^k\frac{2t}{n}.$$

Finally, the Diaconis-Shahshahani bound on the total variation distance between $P_t$ and the uniform
distribution is

$$\|P_t - U\|^2 \le \frac{1}{4}\sum_{k \ne 0}|\tilde{P}_t(k)|^2 = \frac{1}{4}\left[\left(1+\cos^2\frac{2t}{n}\right)^{n} - 1\right].$$

Astonishingly, at $t = (\pi/4)n$ and its odd multiples, this gives a total variation distance which is exactly zero,
showing that if we sample at these times the probability distribution is exactly uniform. Note that this is
possible even when $t < n$ since the continuous-time walk has some probability for taking more than $t$ steps
(and, in fact, paths with different numbers of steps interfere with each other). Thus the continuous-time
walk has the same mixing time as the discrete-time one, but with such a beautiful conspiracy of interference
that every position has an identical probability. This concludes the proof of Theorem 2. For an alternative
derivation based on the hypercube's structure as a product graph, see Appendix C.
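
The exact uniformity at $t = (\pi/4)n$ is easy to confirm numerically. The Python sketch below assumes the product-graph viewpoint: since the normalized adjacency matrix is $1/n$ times a sum of single-bit flips, each coordinate evolves independently, and a vertex of Hamming weight $w$ has probability $\cos^{2(n-w)}(t/n)\,\sin^{2w}(t/n)$. That factorization and the function names are assumptions introduced here for illustration.

```python
import math


def continuous_walk_tv(n, t):
    """Total variation distance from uniform for the continuous-time walk
    with H = (adjacency matrix)/n, started at the origin.

    Assumes the product structure: a vertex of Hamming weight w has
    probability cos^2(t/n)^(n-w) * sin^2(t/n)^w.
    """
    c2 = math.cos(t / n) ** 2
    s2 = math.sin(t / n) ** 2
    uniform = 2.0 ** -n
    return 0.5 * sum(
        math.comb(n, w) * abs(c2 ** (n - w) * s2 ** w - uniform)
        for w in range(n + 1)
    )


def fourier_coefficient(n, k, t):
    """Fourier coefficient of P_t at Hamming weight k: cos^k(2t/n)."""
    return math.cos(2 * t / n) ** k


n = 20
tv_at_mixing = continuous_walk_tv(n, math.pi * n / 4)  # odd multiple, k = 1
tv_at_return = continuous_walk_tv(n, math.pi * n)      # back at the origin
```

At $t = (\pi/4)n$ the distance vanishes up to rounding error, while at $t = \pi n$ the walk has returned to the origin and the distance is $1 - 2^{-n}$, illustrating that the walk does not stay mixed.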



4 Average mixing times 



In this section we discuss the mixing time as defined in [AAKV01], where we choose to stop the quantum
walk at a time $t$ uniformly distributed in the interval $[0,T]$. As mentioned in the Introduction, this gives a
probability distribution $\bar{P}_T = (1/T)\sum_{t=0}^{T-1} P_t$. Since the Fourier transform is a linear operation, we can look
at the Fourier transform of $\bar{P}_T$ instead. In the case of the symmetric discrete-time walk, Equation 5 shows
that for $k > 0$, the Fourier coefficient of $\bar{P}_T$ consists of a sum of oscillating terms proportional to $\cos\omega_\pm t$. As
$T \to \infty$, these oscillations cancel, so we are left with just the constant term $k = 0$ and $\bar{P}_T$ indeed approaches
the uniform distribution.

One could calculate an average mixing time for the symmetric discrete-time walk using the methods of
Appendix B. We do not do that here. However, we will now show that the general bound of [AAKV01]
predicts an average mixing time for the $n$-cube which is exponential in $n$. The authors of that paper showed that
the variation distance between $\bar{P}_T$ and the uniform distribution (or more generally, the limiting distribution
$\lim_{T\to\infty}\bar{P}_T$) is bounded by a sum over distinct pairs of eigenvalues,

$$\|\bar{P}_T - U\| \le \frac{2}{T}\sum_{i,j \text{ s.t. } \lambda_i \ne \lambda_j}\frac{|a_i|^2}{|\lambda_i - \lambda_j|} \tag{9}$$

where $a_i = \langle\psi_0|v_i\rangle$ is the component of the initial state along the eigenvector $v_i$. Since this bound includes
eigenvalues $\lambda_j$ for which $a_j = 0$, we note that it also holds when we replace $|a_i|^2$ with $|a_ia_j^*|$, using the same
reasoning as in [AAKV01].



For the quantum walk on the cycle of length $n$, this bound gives an average mixing time of $O(n \log n)$.
For the $n$-cube, however, there are exponentially many pairs of eigenvectors with distinct eigenvalues, all of
which have a non-zero component in the initial state. Specifically, for each Hamming weight $k$ there are $\binom{n}{k}$
non-trivial eigenvectors each with eigenvalue $e^{i\omega_k}$ and $e^{-i\omega_k}$. These complex conjugates are distinct from
each other for $0 < k < n$, and eigenvalues with distinct $k$ are also distinct. The number of distinct pairs is
then $\Theta(4^n)$.

Taking $|a_i| = 2^{-n/2}/\sqrt{2}$ from Equation 1 and the fact that $|\lambda_i - \lambda_j| \le 2$ since the $\lambda_i$ are on the unit circle,
we see that Equation 9 gives an upper bound on the $\epsilon$-average mixing time of size $\Omega(2^n/\epsilon)$. In general, this
bound will give a mixing time of $\Omega(M/\epsilon)$ whenever the initial state is distributed roughly equally over $M$
eigenvectors, and when these are roughly equally distributed over $\omega(1)$ distinct eigenvalues.

For the continuous-time walk, on the other hand, the result $\tilde{P}_t(k) = \cos^k(2t/n)$ of Section 3 shows that
the Fourier coefficient of $\bar{P}_T$ approaches the average of $\cos^k 2t/n$. In fact, it is equal to this average
whenever $T$ is a multiple of $(\pi/2)n$. For $k$ odd this average is zero, but for $k$ even it is

$$\frac{1}{\pi}\int_0^{\pi} dx\,\cos^k x = \frac{1}{2^k}\binom{k}{k/2}.$$

Since these Fourier coefficients do not vanish, $\bar{P}_T$ does not approach the uniform distribution even in the
limit $T \to \infty$. In particular, the Fourier coefficient of $\bar{P}_T$ for $k = 2$ is

$$\tilde{\bar{P}}_T(2) = \frac{1}{T}\int_0^T dt\,\cos^2\frac{2t}{n} = \frac{1}{2} + \frac{\sin(4T/n)}{8T/n}. \tag{10}$$

This expression is minimized when $T = 1.12335\,n$, at which point $\tilde{\bar{P}}_T(2)$ is approximately 0.39138. Since $\tilde{\bar{P}}_T(2)$ is bounded
below by this, it is easy to show that the total variation distance $\|\bar{P}_T - U\|$ is bounded away from zero as a
result. Thus there exists $\epsilon > 0$ such that no $\epsilon$-average mixing time exists.
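
The numerical constants quoted for Equation 10 can be reproduced with a short one-dimensional minimization. In the Python sketch below we substitute $u = 4T/n$, so the coefficient becomes $1/2 + \sin(u)/(2u)$, and minimize over $(\pi, 2\pi)$, where $\sin(u)/u$ attains its global minimum. The golden-section routine and the names used are illustrative choices, not part of the original analysis.

```python
import math


def averaged_coefficient(u):
    """Time-averaged k = 2 Fourier coefficient as a function of u = 4T/n:
    1/2 + sin(u) / (2u)."""
    return 0.5 + math.sin(u) / (2 * u)


def golden_section_min(f, lo, hi, tol=1e-12):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    ratio = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if f(c) < f(d):
            b = d      # minimum lies in [a, d]
        else:
            a = c      # minimum lies in [c, b]
    return (a + b) / 2


# sin(u)/u is unimodal on (pi, 2*pi) and attains its global minimum there.
u_star = golden_section_min(averaged_coefficient, math.pi, 2 * math.pi)
t_over_n = u_star / 4                     # minimizing T in units of n
minimum = averaged_coefficient(u_star)
```

The search returns $T/n \approx 1.12335$ and a minimum value of about $0.39138$, matching the figures quoted above.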
A Grover's diffusion operator

In general, the selection of the local operator D on the direction space appears to introduce a certain amount 
of artificiality into the definition of a discrete-time quantum walk. If we ask, however, that the operator obey 
the permutation symmetry of the hypercube, then there is a one-parameter family of such unitary operators 
up to multiplication by an overall phase. 

To see this, suppose $D$ is unitary and permutation-symmetric. Then it can have only two distinct entries,
namely those on the diagonal and off it. Let $D_{ij} = a$ if $i = j$ and $b$ if $i \ne j$. Then unitarity requires that
$|a|^2 + (n-1)|b|^2 = 1$ and $2\,\mathrm{Re}(ab^*) + (n-2)|b|^2 = 0$. The first of these two equations describes a circle, and
their difference gives another, $|a - b|^2 = 1$. The intersection of these circles gives at most two values for $b$
which differ only by a phase (and by conjugation if $a$ is real). Solutions exist when $1 - 2/n \le |a| \le 1$.

To show that Grover's operator is the member of this family farthest from the family of diagonal unitary
matrices $\{cI : |c| = 1\}$, recall the norm of a matrix $A$ given by $\|A\|^2 = \mathrm{Tr}\,A^\dagger A$ (the Frobenius norm). Then the
squared distance from $D$ to a member of this family is

$$\|D - cI\|^2 = n|a-c|^2 + (n^2-n)|b|^2 = 2n(1 - \mathrm{Re}\,ac^*)$$

When $c$ has the same phase as $a$ this is minimized at $2n(1-|a|)$, and this minimum is maximized when
$|a| = 1 - 2/n$. This corresponds to Grover's operator times an overall phase; in this paper we take $a$ to be
real and negative.
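
The distance computation can be illustrated with a brute-force minimization over the phase $c$. This is only a sketch: it assumes the Grover entries $a = 2/n - 1$, $b = 2/n$, takes the norm to be the sum of squared entry magnitudes, and samples $c$ on a grid rather than minimizing analytically.

```python
import cmath
import math


def squared_distance(a: complex, b: complex, c: complex, n: int) -> float:
    """Sum of |entry|^2 over D - cI, i.e. Tr (D - cI)^dagger (D - cI)."""
    total = 0.0
    for i in range(n):
        for j in range(n):
            entry = (a - c) if i == j else b
            total += abs(entry) ** 2
    return total


def min_over_phases(a: complex, b: complex, n: int, samples: int = 720) -> float:
    """Minimize the squared distance over unit-modulus c sampled on a grid."""
    return min(
        squared_distance(a, b, cmath.exp(2j * math.pi * s / samples), n)
        for s in range(samples)
    )


for n in (4, 8, 16):
    a, b = 2.0 / n - 1.0, 2.0 / n  # assumed Grover entries
    # The sampled minimum should agree with the closed form 2n(1 - |a|).
    print(n, min_over_phases(a, b, n), 2 * n * (1 - abs(a)))
```

For these entries the closed form $2n(1-|a|)$ evaluates to 4 independently of $n$.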
B Resonances in the discrete-time walk



In order to evaluate Equation 5, we use Fourier analysis again, this time on functions of $j$ and $l$, or rather
on the rescaled variables

$$x = \cos\omega_j = 1 - \frac{2j}{n}, \qquad y = \cos\omega_l = 1 - \frac{2l}{n}$$

We Fourier transform the quantity $Y$ in Equation 5. Since we are interested in oscillations of frequency $\Theta(t)$,
we write

$$Y(x,y) = \sum_{\beta_x,\beta_y} \tilde Y(\beta_x,\beta_y)\, e^{-it(\beta_x x+\beta_y y)} \qquad (11)$$

so that as $t$ goes to infinity, we may treat this as the integral

$$Y(x,y) = \iint \tilde Y(\beta_x,\beta_y)\, e^{-it(\beta_x x+\beta_y y)}\, d\beta_x\, d\beta_y \qquad (12)$$

Then, using the binomial theorem, we have

$$\tilde P_t(k) = \iint d\beta_x\, d\beta_y\, \tilde Y(\beta_x,\beta_y)\, e^{-it\left(\left(1-\frac{k}{n}\right)\beta_x + \frac{k}{n}\beta_y\right)} \cos^k\frac{\beta_x t}{n}\,\cos^{n-k}\frac{\beta_y t}{n} \qquad (13)$$

We will show that $\tilde Y$ peaks at values of $\beta_x$ and $\beta_y$ corresponding to the first-order oscillation, namely
$(\beta_x,\beta_y) = (2,0)$ and $(0,2)$. This gives a form similar to Equation 6, so that if $2t/n = \pi/2$ the total
variation distance will be exponentially small.
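
The binomial-theorem step can be illustrated in isolation. The sketch below assumes the sum over $j$ carries binomial weights $\binom{k}{j}$ and that $x = 1 - 2j/n$, so that each Fourier mode contributes a factor $e^{2i\theta j}$ with $\theta = \beta_x t/n$; it then compares the direct sum with the closed form $2^k e^{ik\theta}\cos^k\theta$. The function names are illustrative.

```python
import cmath
import math


def binomial_phase_sum(k: int, theta: float) -> complex:
    """Direct sum over m of C(k, m) * exp(2i * theta * m)."""
    return sum(math.comb(k, m) * cmath.exp(2j * theta * m) for m in range(k + 1))


def closed_form(k: int, theta: float) -> complex:
    """Binomial theorem: (1 + e^{2i theta})^k = 2^k e^{i k theta} cos^k(theta)."""
    return (2 ** k) * cmath.exp(1j * k * theta) * math.cos(theta) ** k


k, theta = 7, 0.3
print(binomial_phase_sum(k, theta))
print(closed_form(k, theta))
```

The factor $\cos^k\theta$ is the origin of the $\cos^k$ and $\cos^{n-k}$ terms, and the phase $e^{ik\theta}$ combines with the prefactor to give the exponent in Equation 13.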



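We calculate $\tilde Y$ by inverting Equation 12,

$$\tilde Y(\beta_x,\beta_y) = \frac{1}{4}\int_{-1}^{+1}\int_{-1}^{+1} dx\,dy\; Y(x,y)\, e^{it(\beta_x x+\beta_y y)}$$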



where the normalization is due to the range of x and y. We divide this integral into two terms, both of which 
are of the form 



dvdv I ] —^-) eosoo./e" |S ^» 



II 



dxdy 



i=fa 



(14) 



We can evaluate the right-hand integral in Equation 14 using the method of stationary phase, also known as
steepest descent, which Nayak and Vishwanath [NV00] use to find the asymptotic form of the wave function
on the line. In general, if $f$ is a slowly varying function then the asymptotic integral

$$\lim_{t\to\infty} \iint f(x,y)\, e^{it\phi(x,y)}\, dx\, dy$$

is dominated by contributions from the points $(x,y)$ in the domain of integration where $\phi$ has zero gradient.
(See, e.g., [BH75].) If $r$ is the smallest integer such that the $r$th derivative of $\phi$ at $(x,y)$ is nonzero, we
say that $(x,y)$ is $r$th-order. In general, such asymptotic integrals are dominated by contributions from the
stationary points of highest order.

In Equation 14 the slowly varying function is $(1 \mp A)/2$, and the phase function is

$$\phi_\pm(x,y) = \omega_\pm + \beta_x x + \beta_y y$$
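Its derivatives are

$$\frac{\partial\phi_\pm}{\partial x} = \beta_x - \frac{1}{\sin\omega_{j+l}} \pm \frac{1}{\sin\omega_{k-j+l}}, \qquad \frac{\partial\phi_\pm}{\partial y} = \beta_y - \frac{1}{\sin\omega_{j+l}} \mp \frac{1}{\sin\omega_{k-j+l}}$$

For both $\phi_+$ and $\phi_-$, setting these to zero gives four stationary points $(x_0,y_0)$, where the angles $\omega_{j+l}$, $\omega_{k-j+l}$
are described by

$$\sin\omega_{j+l} = \frac{2}{\beta_x+\beta_y}, \qquad \sin\omega_{k-j+l} = \frac{2}{|\beta_x-\beta_y|}$$

$$\cos\omega_{j+l} = x_0 + y_0 - 1 = \pm\sqrt{1 - \frac{4}{(\beta_x+\beta_y)^2}} \qquad (15)$$

$$\cos\omega_{k-j+l} = 1 - \frac{2k}{n} - x_0 + y_0 = \pm\sqrt{1 - \frac{4}{(\beta_x-\beta_y)^2}}$$

Note that the signs of the cosines can be chosen independently, and all four possibilities exist for both $\phi_+$
and $\phi_-$. Choosing both cosines to be positive gives

$$x_0 = 1 - \frac{k}{n} + \frac{1}{2}\left(\sqrt{1 - \frac{4}{(\beta_x+\beta_y)^2}} - \sqrt{1 - \frac{4}{(\beta_x-\beta_y)^2}}\right)$$

$$y_0 = \frac{k}{n} + \frac{1}{2}\left(\sqrt{1 - \frac{4}{(\beta_x+\beta_y)^2}} + \sqrt{1 - \frac{4}{(\beta_x-\beta_y)^2}}\right) \qquad (16)$$

The other three solutions are given by choosing one or both of the cosines in Equation 15 to be negative,
which affects the signs of the square roots in Equation 16. For these solutions to be real, we require $\beta_y \ge
|\beta_x| + 2$ for the stationary points of $\phi_+$, and $\beta_x \ge |\beta_y| + 2$ for the stationary points of $\phi_-$. Thus $\beta_y - \beta_x \ge 2$
for $\phi_+$ and $\beta_x - \beta_y \ge 2$ for $\phi_-$, and in both cases $\beta_x + \beta_y \ge 2$.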
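
The stationary-point conditions can be checked numerically. The sketch below rests on several assumptions reconstructed from the surrounding text rather than quoted from it: $\cos\omega_m = 1 - 2m/n$, rescaled variables $x = 1 - 2j/n$ and $y = 1 - 2l/n$, the phase $\phi_\pm = \omega_{j+l} \pm \omega_{k-j+l} + \beta_x x + \beta_y y$, and $\kappa = k/n$. It places a candidate stationary point with both cosines positive and evaluates the gradient by central differences.

```python
import math


def phase(x, y, beta_x, beta_y, kappa, sign):
    """phi_± = omega_{j+l} ± omega_{k-j+l} + beta_x*x + beta_y*y (kappa = k/n)."""
    omega_sum = math.acos(x + y - 1.0)                  # cos(omega_{j+l}) = x + y - 1
    omega_diff = math.acos(1.0 - 2.0 * kappa - x + y)   # cos(omega_{k-j+l})
    return omega_sum + sign * omega_diff + beta_x * x + beta_y * y


def stationary_point(beta_x, beta_y, kappa):
    """Candidate stationary point with both cosines chosen positive."""
    c_sum = math.sqrt(1.0 - 4.0 / (beta_x + beta_y) ** 2)
    c_diff = math.sqrt(1.0 - 4.0 / (beta_x - beta_y) ** 2)
    x0 = 1.0 - kappa + 0.5 * (c_sum - c_diff)
    y0 = kappa + 0.5 * (c_sum + c_diff)
    return x0, y0


def gradient(beta_x, beta_y, kappa, sign, x, y, h=1e-6):
    """Central-difference gradient of the phase at (x, y)."""
    gx = (phase(x + h, y, beta_x, beta_y, kappa, sign)
          - phase(x - h, y, beta_x, beta_y, kappa, sign)) / (2.0 * h)
    gy = (phase(x, y + h, beta_x, beta_y, kappa, sign)
          - phase(x, y - h, beta_x, beta_y, kappa, sign)) / (2.0 * h)
    return gx, gy


# phi_+ needs beta_y >= |beta_x| + 2; phi_- needs beta_x >= |beta_y| + 2.
for (bx, by), sign in (((0.0, 4.0), +1), ((4.0, 0.0), -1), ((1.0, 5.0), +1)):
    x0, y0 = stationary_point(bx, by, kappa=0.05)
    print((bx, by), sign, gradient(bx, by, 0.05, sign, x0, y0))
```

Both gradient components should vanish up to finite-difference error at each of these points.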

To find the order of these stationary points, we calculate $\phi$'s second derivatives at $(x_0,y_0)$:

$$\frac{\partial^2\phi_\pm}{\partial x^2} = \frac{\partial^2\phi_\pm}{\partial y^2} = -\frac{\cos\omega_{j+l}}{\sin^3\omega_{j+l}} \mp \frac{\cos\omega_{k-j+l}}{\sin^3\omega_{k-j+l}}$$

$$\frac{\partial^2\phi_\pm}{\partial x\,\partial y} = \frac{\partial^2\phi_\pm}{\partial y\,\partial x} = -\frac{\cos\omega_{j+l}}{\sin^3\omega_{j+l}} \pm \frac{\cos\omega_{k-j+l}}{\sin^3\omega_{k-j+l}} \qquad (17)$$

Given the restrictions on $\beta_x$ and $\beta_y$ for the stationary points to be real, for each of $\phi_+$ and $\phi_-$ the second
derivatives are zero at exactly one pair of frequencies, namely $\beta_x = 0$ and $\beta_y = 2$ for $\phi_+$, and $\beta_x = 2$ and
$\beta_y = 0$ for $\phi_-$. We will call these the dominant stationary points. Note that at these frequencies we have
$\omega_{j+l} = \omega_{k-j+l} = \pi/2$ and the four stationary points coincide at the peak of the binomials in Equation 5,
where $j = k/2$ and $l = (n-k)/2$. Moreover, these frequencies are exactly the first-order oscillations of $Y$
appearing in Equation 6.

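Computing the third-order derivatives at $\omega_{j+l} = \omega_{k-j+l} = \pi/2$ gives

$$\frac{\partial^3\phi_\pm}{\partial x^3} = \frac{\partial^3\phi_\pm}{\partial x\,\partial y^2} = -\left(\frac{1}{\sin^3\omega_{j+l}} + \frac{3\cos^2\omega_{j+l}}{\sin^5\omega_{j+l}}\right) \pm \left(\frac{1}{\sin^3\omega_{k-j+l}} + \frac{3\cos^2\omega_{k-j+l}}{\sin^5\omega_{k-j+l}}\right) = -1 \pm 1$$

$$\frac{\partial^3\phi_\pm}{\partial y^3} = \frac{\partial^3\phi_\pm}{\partial x^2\,\partial y} = -\left(\frac{1}{\sin^3\omega_{j+l}} + \frac{3\cos^2\omega_{j+l}}{\sin^5\omega_{j+l}}\right) \mp \left(\frac{1}{\sin^3\omega_{k-j+l}} + \frac{3\cos^2\omega_{k-j+l}}{\sin^5\omega_{k-j+l}}\right) = -1 \mp 1$$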
Thus the dominant stationary points are third order, and in their vicinity, with $x$ and $y$ measured from the stationary point and the constant term omitted, $\phi_\pm$ takes the form

$$\phi_\pm = \frac{1}{6}\left(-(x+y)^3 \pm (x-y)^3\right) + O(x^4, y^4)$$
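
This cubic form can be checked numerically. The sketch below assumes the same reconstructed conventions as the text, namely $\cos\omega_{j+l} = x + y - 1$, $\cos\omega_{k-j+l} = 1 - 2k/n - x + y$, and $\phi_\pm = \omega_{j+l} \pm \omega_{k-j+l} + \beta_x x + \beta_y y$, with the dominant stationary point placed at $(x_0, y_0) = (1 - k/n,\, k/n)$. It compares the change in $\phi_\pm$ under a small displacement $(u, v)$ with the cubic model.

```python
import math


def phase(x, y, beta_x, beta_y, kappa, sign):
    """phi_± = omega_{j+l} ± omega_{k-j+l} + beta_x*x + beta_y*y (kappa = k/n)."""
    omega_sum = math.acos(x + y - 1.0)
    omega_diff = math.acos(1.0 - 2.0 * kappa - x + y)
    return omega_sum + sign * omega_diff + beta_x * x + beta_y * y


def cubic_model(u, v, sign):
    """Leading behaviour near the dominant point: (-(u+v)^3 ± (u-v)^3) / 6."""
    return (-(u + v) ** 3 + sign * (u - v) ** 3) / 6.0


def phase_change(u, v, kappa, sign):
    """phi_±(x0 + u, y0 + v) - phi_±(x0, y0) at the dominant stationary point."""
    beta_x, beta_y = (0.0, 2.0) if sign > 0 else (2.0, 0.0)
    x0, y0 = 1.0 - kappa, kappa
    return (phase(x0 + u, y0 + v, beta_x, beta_y, kappa, sign)
            - phase(x0, y0, beta_x, beta_y, kappa, sign))


for sign in (+1, -1):
    for u, v in ((0.01, -0.02), (0.03, 0.01)):
        print(sign, (u, v), phase_change(u, v, 0.3, sign), cubic_model(u, v, sign))
```

The absence of linear and quadratic terms in the displacement is what makes these points third order.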

Thus if we rotate by $\pi/4$ to new variables $a = x + y$ and $b = x - y$, we transform $\phi$ into the sum of two decoupled
functions in the vicinity of the dominant stationary point, and write the integral of Equation 14 as the product
of two one-dimensional integrals. For one-dimensional integrals with a third-order stationary point $x_0$, this
takes the form [BH75, §7]

/r/i fi\ „i7tsgn(o)/6 
dx/(*)e*M = iW^/foje***) ■ , /3 + (r V3) 
t ' 3a' 

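where $\alpha = \phi'''(x_0)$ is the third derivative at $x_0$. Since we have the product of two such integrals, and since
$f(x_0) = (1 \mp A)/2 = O(1)$ and $|\alpha| = 2$, the contribution of the dominant stationary point to $\tilde P_t(k)$ is

$$\tilde P_t(k)\Big|_{\text{dominant}} = O\!\left(t^{-2/3}\left(\cos^k\frac{2t}{n} + \cos^{n-k}\frac{2t}{n}\right)\right) \qquad (18)$$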

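We now need to calculate the contribution of the other stationary points. These are second-order, and
their contribution takes the form

$$\lim_{t\to\infty} \iint dx\,dy\, f(x,y)\, e^{it\phi(x,y)} = \frac{2\pi}{t} \sum_{(x,y)} f(x,y)\, e^{it\phi(x,y)}\, \frac{e^{i\pi\delta_{x,y}/2}}{\sqrt{\left|\det \partial^2\phi_{x,y}\right|}} + o\!\left(\frac{1}{t}\right) \qquad (19)$$

where the sum runs over the stationary points, $\partial^2\phi_{x,y}$ is the matrix of second derivatives of $\phi$ at $(x,y)$, and $\delta_{x,y}$ is $+1$, $0$, or $-1$ depending on whether
zero, one, or both of its eigenvalues are negative. From Equation 17 we have

$$\det \partial^2\phi_\pm = \pm\frac{4\cos\omega_{j+l}\,\cos\omega_{k-j+l}}{\sin^3\omega_{j+l}\,\sin^3\omega_{k-j+l}}$$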



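Focusing on the oscillating part of Equation 13, we have

$$\iint d\beta_x\, d\beta_y\; e^{it\psi_\pm(\beta_x,\beta_y)} \cos^k\frac{\beta_x t}{n}\, \cos^{n-k}\frac{\beta_y t}{n} \qquad (20)$$

where

$$\psi_\pm(\beta_x,\beta_y) = \phi_\pm(x_0,y_0) - \left(1 - \frac{k}{n}\right)\beta_x - \frac{k}{n}\beta_y$$

Since this really is an integral in the limit $n \to \infty$, the $\cos^k$, $\cos^{n-k}$ terms create sharper and sharper peaks
where $\beta_x$, $\beta_y$ are multiples of 4 (for $t = (\pi/4)n$, these are the points where $\beta_x t/n$ and $\beta_y t/n$ are multiples of $\pi$). We can approximate $\psi$ at each peak to first order as a function of $\beta_x$ and $\beta_y$.
For the stationary point of $\phi_\pm$ where the sign of both cosines is positive, $\psi_\pm$ is given by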



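Its derivatives with respect to $\beta_x$ and $\beta_y$ are

$$\frac{\partial\psi_\pm}{\partial\beta_x} = \frac{1}{2}\left(\sqrt{1 - \left(\frac{2}{\beta_x+\beta_y}\right)^2} - \sqrt{1 - \left(\frac{2}{\beta_x-\beta_y}\right)^2}\right) = \frac{2}{n}\left(\frac{k}{2} - j_0\right)$$

$$\frac{\partial\psi_\pm}{\partial\beta_y} = \frac{1}{2}\left(\sqrt{1 - \left(\frac{2}{\beta_x+\beta_y}\right)^2} + \sqrt{1 - \left(\frac{2}{\beta_x-\beta_y}\right)^2}\right) = \frac{2}{n}\left(\frac{n-k}{2} - l_0\right) \qquad (21)$$

where $j_0$ and $l_0$ are the values of $j$ and $l$ at the stationary point, and similarly for the other stationary points $(x_0,y_0)$; we can also derive this directly from the definition of
$\psi_\pm$ and the fact that we are at a stationary point of $\phi_\pm$. In other words, the derivatives of $\psi$ are proportional
to the distance of the stationary points off the binomial peaks.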

The entire $(\beta_x, \beta_y)$-plane can be tiled with $4 \times 4$ squares centered on these peaks. Integrating Equation 20
on one such tile, say around the peak $\beta_x = 4p$, $\beta_y = 4q$, gives



4p+2 r 4q+2 



4p-2 J4q-2 



3V i 



dMP^^+l^) cos^cos-* 



n 



2 2 

2"t 2 
--o exp 



3fc 
Jo 



n \ k 



+ 1 



h 



K 2 n 2 
2/H 2 

-ln2 



n — k 
h 



■ exp(«Z) 



(22) 



where $h(z) = -z\ln z - (1-z)\ln(1-z)$ is the entropy function. Note that if the quantity $Z$ in Equation 22
is less than $-\ln\sqrt{2}$ for all stationary points other than the dominant ones, then their contribution to $|\tilde P_t(k)|^2$
will be $2^{-\gamma n}$ where $\gamma > 1$, in which case summing over all $k$ will give an exponentially small contribution,
$O(2^{(1-\gamma)n})$, to the total variation distance. To confirm this, note that $Z$ is maximized by the other stationary
points closest to the origin, such as the stationary point of $\phi_+$, with both cosines positive, where $\beta_x = 0$
and $\beta_y = 4$. From Equation 21 this gives $\partial\psi/\partial\beta_x = 0$ and $\partial\psi/\partial\beta_y = \sqrt{3}/2$, and so $j_0 = k/2$ and $l_0 =
\left(\left(1 - \frac{\sqrt{3}}{2}\right)n - k\right)/2$. Both binomials are non-zero only in the interval $k \in \left[0, \left(1 - \frac{\sqrt{3}}{2}\right)n\right]$, and $Z$ is maximized
at $k = 0$, where

$$Z = h\!\left(\frac{1}{2}\left(1 - \frac{\sqrt{3}}{2}\right)\right) - \ln 2 \approx -0.447 < \ln\frac{1}{\sqrt{2}} \approx -0.346$$

The other second-order stationary points are this far or farther from the origin, giving values of $j_0$ and $l_0$
farther off the binomial peaks, and therefore smaller entropies.
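
The two numerical values quoted above can be reproduced directly. This is a small sketch that assumes, as the text indicates for $k = 0$, that the exponent is the entropy of the fraction $l_0/n = \frac{1}{2}(1 - \sqrt{3}/2)$ minus $\ln 2$; the variable names are illustrative.

```python
import math


def entropy(z: float) -> float:
    """Natural-log entropy function h(z) = -z ln z - (1 - z) ln(1 - z)."""
    if z <= 0.0 or z >= 1.0:
        return 0.0
    return -z * math.log(z) - (1.0 - z) * math.log(1.0 - z)


fraction = 0.5 * (1.0 - math.sqrt(3.0) / 2.0)   # l_0 / n at k = 0
z_exponent = entropy(fraction) - math.log(2.0)  # exponent per dimension
threshold = math.log(1.0 / math.sqrt(2.0))      # -ln sqrt(2)

print(z_exponent)
print(threshold)
print(z_exponent < threshold)
```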

Recalling Equation 19 above, our final concern is the sum of the heights of these peaks,

$$\sum \frac{1}{\sqrt{\left|\det \partial^2\phi_\pm(\beta_x,\beta_y)\right|}}$$

taken over all second-order stationary points $(\beta_x, \beta_y)$. Since these occur when $\beta_x$, $\beta_y$ are multiples of 4, from
Equation 15 we have $|\cos\omega_{j+l}\,\cos\omega_{k-j+l}| \ge 3/4$. Then

$$\left|\det \partial^2\phi_\pm(\beta_x,\beta_y)\right| \ge \frac{3}{\sin^3\omega_{j+l}\,\sin^3\omega_{k-j+l}} = \frac{3}{64}\,|\beta_x+\beta_y|^3\,|\beta_x-\beta_y|^3$$

and it is sufficient to show that the sum

$$\sum |\beta_x+\beta_y|^{-3/2}\,|\beta_x-\beta_y|^{-3/2}$$

converges. Again rotating by $\pi/4$ to variables $a = \beta_x + \beta_y$ and $b = \beta_x - \beta_y$, we get the sum

$$\sum_{a,b} |a|^{-3/2}\,|b|^{-3/2} \le \left(\sum_{a \ne 0} |a|^{-3/2}\right)^2$$

Observing that $\sum_{a>0} a^{-3/2}$ converges shows that the contribution of the second-order stationary points is
exponentially small.

Now we return to the dominant contribution to $\tilde P_t(k)$, Equation 18. If we could have $t = (\pi/4)n$ exactly,
this dominant term would be zero, leaving us with the second-order stationary points and an exponentially
small total variation distance. However, in the discrete-time walk $t$ must be an integer. Setting $t = \lceil(\pi/4)n\rceil$,
we have $\cos(2t/n) = O(1/n)$. Using the binomial theorem and Equation 18, the Diaconis-Shahshahani bound
gives

$$\|P_t - U\|^2 \le O(t^{-4/3}) \sum_{0<k<n} \binom{n}{k}\left(2\cos^{2k}\frac{2t}{n} + 2\cos^{2(n-k)}\frac{2t}{n}\right) = O(n^{-4/3})\left[\left(1 + \cos^2\frac{2t}{n}\right)^n - 1 - \cos^{2n}\frac{2t}{n}\right] = O(n^{-7/3})$$

and so the total variation distance is $\|P_t - U\| = O(n^{-7/6})$, completing the proof of Theorem 1.
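
The effect of rounding $t$ to an integer can be illustrated numerically. The sketch below assumes $t$ is $(\pi/4)n$ rounded up, and checks that $n\,|\cos(2t/n)|$ and $n\sum_{0<k<n}\binom{n}{k}\cos^{2k}(2t/n)$ stay bounded as $n$ grows. The function names are illustrative.

```python
import math


def mixing_step(n: int) -> int:
    """Assumed integer stopping time: (pi/4) n rounded up."""
    return math.ceil(math.pi * n / 4.0)


def residual_cosine(n: int) -> float:
    """cos(2t/n) at the integer stopping time; it vanishes only if t = (pi/4) n exactly."""
    return math.cos(2.0 * mixing_step(n) / n)


def weighted_sum(n: int) -> float:
    """sum_{0<k<n} C(n,k) c^{2k} with c = cos(2t/n), via (1 + c^2)^n - 1 - c^{2n}."""
    c2 = residual_cosine(n) ** 2
    # expm1/log1p keep precision when c2 is tiny.
    return math.expm1(n * math.log1p(c2)) - c2 ** n


for n in (10, 100, 1000, 10000):
    print(n, n * abs(residual_cosine(n)), n * weighted_sum(n))
```

Since $t$ differs from $(\pi/4)n$ by less than 1, the angle $2t/n$ differs from $\pi/2$ by less than $2/n$, which is why the first printed column never exceeds 2.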



C A graph product derivation of the continuous-time walk 

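As an alternate derivation for the continuous-time walk, we can calculate the wave function $\psi_t$ directly by
exploiting the hypercube's simple structure as a product graph. Let $\sigma_x$ be the Pauli matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$. Then
we can rewrite the Hamiltonian of the continuous-time walk as

$$H = \frac{1}{n}\sum_{j=1}^{n} 1 \otimes \cdots \otimes \sigma_x \otimes \cdots \otimes 1$$

where the $j$'th term in the sum has $\sigma_x$ appearing in the $j$'th place in the tensor product. Then using the identity
$(A \otimes B)(C \otimes D) = AC \otimes BD$, and the fact that $e^{A+B} = e^A e^B$ when $A$ and $B$ commute, we have

$$U_t = e^{iHt} = \prod_{j=1}^{n} 1 \otimes \cdots \otimes e^{it\sigma_x/n} \otimes \cdots \otimes 1 = \left[e^{it\sigma_x/n}\right]^{\otimes n} = \begin{pmatrix} \cos\frac{t}{n} & i\sin\frac{t}{n} \\ i\sin\frac{t}{n} & \cos\frac{t}{n} \end{pmatrix}^{\otimes n}$$

where $A^{\otimes n}$ is the tensor product of $n$ copies of $A$. If $\psi_0 = |0 \cdots 0\rangle = |0\rangle^{\otimes n}$, then

$$\psi_t = U_t \psi_0 = \left[\cos\frac{t}{n}\,|0\rangle + i\sin\frac{t}{n}\,|1\rangle\right]^{\otimes n}$$

and we see that the continuous-time walk is equivalent to $n$ non-interacting one-qubit systems. Then the
amplitude for observing the particle at a position $x$ with Hamming weight $|x|$ is

$$\psi_t(x) = \cos^{n-|x|}\frac{t}{n}\,\left(i\sin\frac{t}{n}\right)^{|x|}$$

which when $t = k(\pi/4)n$ for $k$ odd gives $|\psi_t(x)|^2 = 2^{-n}$, the uniform distribution.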
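
The product formula can be cross-checked against a direct evolution on a small hypercube. The sketch below assumes $H$ is the hypercube adjacency operator divided by $n$ and $U_t = e^{iHt}$, and applies a truncated Taylor series of the exponential to the state concentrated on the all-zeros vertex. The function names and the truncation length are illustrative choices, not part of the paper.

```python
import math


def apply_hamiltonian(state, n):
    """(H psi)(x) = (1/n) * sum of psi over the n neighbours of x (one bit flipped)."""
    return [sum(state[x ^ (1 << j)] for j in range(n)) / n for x in range(len(state))]


def evolve(n, t, terms=80):
    """psi_t = exp(iHt)|0...0>, computed with a truncated Taylor series."""
    state = [0j] * (1 << n)
    state[0] = 1.0 + 0j
    term = list(state)
    for m in range(1, terms):
        # term_m = (iHt)^m / m! applied to the initial state
        term = [(1j * t / m) * amp for amp in apply_hamiltonian(term, n)]
        state = [s + d for s, d in zip(state, term)]
    return state


def product_amplitude(n, t, weight):
    """Amplitude from the product formula for a vertex of the given Hamming weight."""
    return math.cos(t / n) ** (n - weight) * (1j * math.sin(t / n)) ** weight


n = 4
t = math.pi * n / 4.0
psi = evolve(n, t)
for x in range(1 << n):
    weight = bin(x).count("1")
    print(x, abs(psi[x]) ** 2, abs(product_amplitude(n, t, weight)) ** 2)
```

At $t = (\pi/4)n$ every vertex should carry probability $2^{-n}$, here $1/16$.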